How Are Spelling Errors Generated and Corrected? A Study of Corrected and Uncorrected Spelling Errors Using Keystroke Logs
نویسندگان
چکیده
This paper presents a comparative study of spelling errors that are corrected as you type, vs. those that remain uncorrected. First, we generate naturally occurring online error correction data by logging users’ keystrokes, and by automatically deriving preand postcorrection strings from them. We then perform an analysis of this data against the errors that remain in the final text as well as across languages. Our analysis shows a clear distinction between the types of errors that are generated and those that remain uncorrected, as well as across languages.
منابع مشابه
Context-Sensitive Spelling Correction of Consumer-Generated Content on Health Care
BACKGROUND Consumer-generated content, such as postings on social media websites, can serve as an ideal source of information for studying health care from a consumer's perspective. However, consumer-generated content on health care topics often contains spelling errors, which, if not corrected, will be obstacles for downstream computer-based text analysis. OBJECTIVE In this study, we propose...
متن کاملTime of Memorization and English Spelling Difficulties among Iranian EFL Students in Malaysia
AbstractIn this study, phonological, morphological, and orthographical spelling difficulties were identified to examine the correlation between spelling difficulties and the time taken to memorize the spelling of words (time of memorization) among Iranian EFL students in Malaysia. The participants were 41 Iranian EFL students (20 male and 21 female) who were selected purposefully from an Irania...
متن کاملSpelling Errors of Iranian School-Level EFL Learners: Potential Sources
With the purpose of examining the sources of spelling errors of Iranian school level EFL learners, the present researchers analyzed the dictation samples of 51 Iranian senior and junior high school male and female students majoring at an Iranian school in Baku, Azerbaijan. The content analysis of the data revealed three main sources (intralingual, interlingual, and unique) with seven patterns o...
متن کاملDesign and implementation of Persian spelling detection and correction system based on Semantic
Persian Language has a special feature (grapheme, homophone, and multi-shape clinging characters) in electronic devices. Furthermore, design and implementation of NLP tools for Persian are more challenging than other languages (e.g. English or German). Spelling tools are used widely for editing user texts like emails and text in editors. Also developing Persian tools will provide Persian progr...
متن کاملارائه یک رتبهبند برای خطایاب معنایی با استفاده از ویژگیهای حساس به متن
Nowadays, a large volume of documents is generated daily. These documents generated by different persons, thus, the documents contain spelling errors. These spelling errors cause quality of the documents are decrease. Therefore, existence of automatic writing assistance tools such as spell checker/corrector can help to improve their quality. Context-sensitive are misspelled words that have been...
متن کامل